Linköping University Post Print
Integration of Tools for Binding Archetypes to SNOMED CT
Abstract
Background: The Archetype formalism and the associated Archetype Definition Language have been proposed as an ISO standard for specifying models of components of electronic healthcare records as a means of achieving interoperability between clinical systems. This paper presents an archetype editor with support for manual or semi-automatic creation of bindings between archetypes and terminology systems.

Methods: Lexical and semantic methods are applied in order to obtain automatic mapping suggestions. Information visualisation methods are also used to assist the user in exploration and selection of mappings.

Results: An integrated tool for archetype authoring, semi-automatic SNOMED CT terminology binding assistance and terminology visualization was created and released as open source.

Conclusion: Finding the right terms to bind is a difficult task but the effort to achieve terminology bindings may be reduced with the help of the described approach. The methods and tools presented are general, but here only bindings between SNOMED CT and archetypes based on the openEHR reference model are presented in detail.

from First European Conference on SNOMED CT, Copenhagen, Denmark. 1–3 October 2006
Published: 27 October 2008
BMC Medical Informatics and Decision Making 2008, 8(Suppl 1):S7 doi:10.1186/1472-6947-8-S1-S7
Selected contributions to the First European Conference on SNOMED CT. Stefan Schulz and Gunnar O Klein
Publication of this supplement was supported by EU Network of Excellence "Semantic Interoperability and Data Mining in Biomedicine"
Proceedings http://www.biomedcentral.com/content/pdf/1471-6947-8-S1-info.pdf
This article is available from: http://www.biomedcentral.com/1472-6947/8/S1/S7
© 2008 Sundvall et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

Standardisation efforts in health informatics, including HL7, CEN, ISO, openEHR and IHTSDO, have provided EHR information model specifications as well as reference terminologies aiming at semantic interoperability [1]. Tools have been provided for managing the artefacts involved, such as archetype editors (see http://www.openehr.org/) and terminology browsers [2,3]. Yet, tools that support the integrated use of terminology and information models are not widespread. This paper describes the integration of three applications related to archetypes and terminology systems: a) an editor for archetype development, b) MoST, a system for selecting terms from SNOMED CT to be bound to archetypes, and c) TermViz, a tool for visualizing and navigating terminology systems. The 'archetype' approach to information modelling is introduced below and is followed by descriptions of the three applications and their integration.

Modelling in openEHR

The openEHR foundation (http://www.openehr.org) aims to facilitate interoperable implementations of electronic health record systems (EHRs) by developing and promoting open specifications and specifications-based implementations. The intention behind the specifications is to enable interoperability while still being flexible regarding information modelling design choices as well as choices of terminology systems, implementation technology, and human language translations.
The architecture of openEHR aims to scale from small desktop systems for general practitioners to distributed, patient-centred, lifelong shared-care health record systems [4]. The openEHR architecture [4] includes a design principle called 'Ontological separation', which regulates the EHR modelling; see Figure 1. The structure is divided into two main categories entitled 'ontologies of information' and 'ontologies of reality'. Note that the words 'Ontological' and 'ontologies' come from the source [4]; in our opinion, 'models' could be an equivalent term. The 'ontologies of information' contain the information models of the EHR content whereas the 'ontologies of reality' describe real phenomena with descriptions and classifications.

The 'ontologies of information' are then divided into:
• 'Domain content models' containing formal definitions of the clinical content. They can be developed using archetypes, which are designed to be easy to change when new clinical needs arise. Detailed openEHR archetype information, examples and resources are available from http://www.openehr.org/clinicalmodels/archetypes.html
• 'Information representation models', which are implemented in the electronic health care systems software. They are used as a foundation for the domain content models and are designed to be stable with regard to model changes. In openEHR, this component is named the Reference model.

The 'ontologies of reality' contain e.g.:
• 'classifications', like ICDx and ICPC,
• 'process descriptions', like clinical guidelines,
• 'descriptive terminologies', like SNOMED CT.

EHR extracts based on common shared archetypes are proposed as a means to exchange information between different health care providers [4]. Semantics of the domain content models (e.g. archetypes) are provided by terminology binding. The meaning of nodes in archetypes is given by textual descriptions and optionally by reference to external terminology systems:
1. term definition – a node of an archetype is given meaning through a name and textual description,
2. term binding – a node of an archetype is given meaning by reference to an external terminology.

SNOMED CT

SNOMED CT is the terminology system used for application in this paper. It is a clinical terminology based on concept representations that are related to each other by different types of relationships, like 'IS-A' (subtype), 'Part of', 'Causative agent' and many others. Each SNOMED CT concept representation is associated with a set of synonymous terms (coupled with metadata) called descriptions [5]. The number of active core concept representations in the January 2008 International Core release is 311 313 [6].

Figure 1: Ontological structure. Illustration of openEHR's ontological structure. Adapted from [1]. Ontologies of Information: Domain Content Models (archetypes), Information Representation Models (openEHR reference model). Ontologies of Reality: Classifications (ICDx, ICPC), Process Descriptions (Guidelines), Descriptive Terminologies (SNOMED CT).

Methods

The applications for archetype editing, semi-automatic terminology binding and terminology visualization that have been integrated are briefly described in this section.

The archetype editor

Authoring of archetypes is not intended to be part of the daily routine of clinicians. Instead, the goal is to develop archetypes that can be used in many different situations over a long period of time and to use them as parts of templates for clinical data entry. The purpose of the archetype editor is to let users build archetypes in an intuitive graphical environment (see Figure 2), without prior knowledge of formal representations of archetypes like the 'Archetype Definition Language' (ADL) or XML.
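To make the distinction between term definition and term binding (introduced under 'Modelling in openEHR') concrete, the following minimal Python sketch models a single archetype node. It is not the editor's internal data model: the class name ArchetypeNode, its fields, the node identifier and the SNOMED CT code are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ArchetypeNode:
    """Hypothetical, simplified archetype node (not the editor's real model)."""
    code: str         # archetype-local node identifier (illustrative)
    text: str         # term definition: the node's name
    description: str  # term definition: the node's textual description
    # Term bindings: terminology name -> codes in that external terminology.
    bindings: Dict[str, List[str]] = field(default_factory=dict)

    def bind(self, terminology: str, external_code: str) -> None:
        """Add a term binding, ignoring exact duplicates."""
        codes = self.bindings.setdefault(terminology, [])
        if external_code not in codes:
            codes.append(external_code)

    def is_bound(self, terminology: str) -> bool:
        """True if the node has at least one binding in the given terminology."""
        return bool(self.bindings.get(terminology))


# Illustrative values only; verify identifiers and codes against real releases.
systolic = ArchetypeNode(
    code="at0004",
    text="Systolic",
    description="Illustrative description of the node.",
)
systolic.bind("SNOMED-CT", "271649006")
```

A node always carries its term definition, while bindings are optional and kept per terminology, so the same node could later be bound to a second terminology without touching its definition.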
We believe that an archetype editor that allows the user to create new archetypes and learn from previously created ones by viewing and exploring is important for developing good quality archetypes. The development of the Java-based archetype editor at Linköping University [7] focused on improving terminology binding support and usability. Compared with existing editors, it also removed operating system dependencies. Connections to external terminology sources like SNOMED CT and UMLS were included so that the effort required to bind terms with the help of external terminology sources was reduced compared to manual lookup.

The MoST system

In order to bind nodes in clinical data models to nodes in external terminologies we must first find appropriate matches. The Model Standardisation using Terminology (MoST) system [8], developed at the University of Manchester, is a general semi-automated mapping process providing the clinical modeller with candidate mappings. The mapping manually determined to be the most suitable can then be bound to a content model entity. The specific clinical data models selected to demonstrate the applicability of the methodology in this paper are archetypes according to the openEHR archetype model, and SNOMED CT is the terminology to which they have been mapped.

In the MoST mapping process, as shown in Figure 3, archetypes are converted from ADL format to a general XML format designed to represent hierarchical data models. The clinical content of the model is then passed to the actual mapping process, which executes various lexical and semantic procedures by referring to existing medical resources (detailed below) and SNOMED CT. The first round of mapping includes a lexical processing of terms using the Emergency Medical Text Processing (EMT-P) service. It is a natural language processing (NLP) tool, which cleans up raw text entries [9].
EMT-P then looks for matches in the Unified Medical Language System (UMLS) resources and the UMLS LVG database, which consists of normalised word forms (see http://umlsks.nlm.nih.gov/). The MoST methodology makes use of the lexical procedures of both the EMT-P tool and the UMLS resource at the same time to draw upon their individual and combined strengths to find relevant matches.

All archetype terms, irrespective of whether they have found a match in the first round, are sent to the second round for normalisation. Normalisation involves execution of a series of lexical and semantic methods and collation of results from each. Some of the methods employed include a training dataset with commonly used clinical synonyms and abbreviations, and context search. An external NLP application named GATE (http://gate.ac.uk) was used for stemming, based on regular expression rules developed for its Morphological Analyzer, and for synonym search using its WordNet (http://wordnet.princeton.edu) plugin.

At the end of both rounds, the collated results are subjected to elimination through filtering. All filtered SNOMED CT results are presented to the clinical modeller as candidate mappings. The filtering and evaluation details are described in [8], as they are beyond the scope of this paper. Briefly, filtering comprises two main levels. The first is exclusion of all concepts subsumed by a parent concept occurring in the result set, and inclusion of all non-occurring parent concepts if more than three child concepts are present in the result set. The second level involves inclusion of only those results whose semantic category(ies) is similar to the one specified by the clinical modeller. However, MoST provides for the possibility of a human and/or SNOMED CT categorisation error.
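The two filtering levels described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the MoST implementation (see [8] for that): the function names, the toy hierarchy and the category labels are hypothetical; 'subsumed by a parent' is read here as 'has any ancestor in the result set'; qualifying parents are added before subsumed concepts are removed; and 'similar' semantic category is simplified to set membership.

```python
from typing import Dict, Set


def ancestors(concept: str, is_a: Dict[str, Set[str]]) -> Set[str]:
    """All transitive IS-A ancestors of a concept (is_a maps child -> parents)."""
    seen: Set[str] = set()
    stack = list(is_a.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, ()))
    return seen


def filter_level_one(results: Set[str], is_a: Dict[str, Set[str]],
                     child_threshold: int = 3) -> Set[str]:
    """Level 1: add frequent non-occurring parents, then drop subsumed concepts."""
    # Count, for each parent missing from the results, its children in the results.
    child_count: Dict[str, int] = {}
    for concept in results:
        for parent in is_a.get(concept, ()):
            if parent not in results:
                child_count[parent] = child_count.get(parent, 0) + 1
    # Include a missing parent when MORE than `child_threshold` children occur.
    enlarged = set(results) | {p for p, n in child_count.items()
                               if n > child_threshold}
    # Exclude every concept that has an ancestor in the (enlarged) result set.
    return {c for c in enlarged if not (ancestors(c, is_a) & enlarged)}


def filter_level_two(results: Set[str], category_of: Dict[str, str],
                     wanted: Set[str]) -> Set[str]:
    """Level 2: keep results whose semantic category was requested."""
    return {c for c in results if category_of.get(c) in wanted}


# Toy hierarchy (hypothetical identifiers, not SNOMED CT content).
toy_is_a = {"a1": {"A"}, "a2": {"A"}, "a3": {"A"}, "a4": {"A"},
            "b1": {"B"}, "A": {"root"}, "B": {"root"}}
toy_results = {"a1", "a2", "a3", "a4", "b1", "B"}
level_one = filter_level_one(toy_results, toy_is_a)  # {"A", "B"}
level_two = filter_level_two(level_one, {"A": "observable entity", "B": "finding"},
                             {"observable entity"})  # {"A"}
```

In the toy data, four children of A occur without A itself, so A is added and its children are then dropped as subsumed; b1 is dropped because its parent B already occurs.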
The candidate mappings can be viewed in simple tabular form in the editor (Figure 4), along with the facility to further explore the relevant SNOMED CT hierarchy using the visualization technique described below. See [10] for comprehensive information regarding MoST.

Terminology visualization

Large terminology systems with complex intertwined structure can be hard to navigate and get acquainted with. Free-text queries are possible entries into the exploration of such systems, and the way results are presented has an impact on the user's ability to grasp the overall structure of the system. Complex hierarchies like the one used in SNOMED CT, where nodes have multiple parents and several other relationship types, make visualization challenging. A previous paper [3] presented a prototype, called TermViz, applying well-known methods from the fields of Information Visualization and Graph Drawing like 'focus+context' and self-organizing layouts. The user can simultaneously focus on several nodes in terminology systems and then use interactive animated graph navigation for further exploration without losing context. 'Semantic zooming', i.e. reducing the amount of visible information, e.g. text labels far from focused nodes, is also available; see Figure 5. This part of the tool can also be used as a stand-alone SNOMED CT browser.
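One way to picture 'semantic zooming' with several simultaneous focus nodes is a multi-source breadth-first search that shows text labels only for nodes close to any focus node. The Python sketch below illustrates that idea only; it is not TermViz code, and the function names, the undirected adjacency representation and the distance threshold are assumptions.

```python
from collections import deque
from typing import Dict, Iterable, Set


def distances_from_focus(adjacency: Dict[str, Set[str]],
                         focus: Iterable[str]) -> Dict[str, int]:
    """Graph distance from each reachable node to its nearest focus node."""
    dist = {node: 0 for node in focus}
    queue = deque(dist)  # start the search from all focus nodes at once
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def visible_labels(adjacency: Dict[str, Set[str]], focus: Iterable[str],
                   max_distance: int = 1) -> Set[str]:
    """Nodes whose labels stay visible; labels further away are hidden."""
    dist = distances_from_focus(adjacency, focus)
    return {node for node, d in dist.items() if d <= max_distance}


# Toy chain a - b - c - d - e with two focus nodes, a and e.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
         "d": {"c", "e"}, "e": {"d"}}
shown = visible_labels(chain, ["a", "e"], max_distance=1)  # {"a", "b", "d", "e"}
```

Because the search starts from all focus nodes together, a node counts as near when it is close to any one of them, which matches the idea of keeping context around every focused node.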
Updates regarding TermViz are available at http://www.imt.liu.se/~erisu/TermViz/

Results

In this section the integrated application is demonstrated using the blood pressure archetype, shown in the interface view of the editor illustrated in Figure 6. The definition view of the editor (see Figure 2) can be used to:
• structure and name the fields in the archetype
• mark fields as mandatory or optional
• restrict format and kind of information to be allowed in a field

In an archetype the 'fields' described above are nodes within a tree structure. Nodes can be bound to terminologies, such as SNOMED CT, as seen in Figure 4. The archetype is sent to the remote MoST-service (accessed using a SOAP-based Web service). In the tree structure to the left, labels ending with e.g. (14 SNOMED) indicate that MoST has found fourteen candidate mappings for the node. Upon selecting a node, the suggestions are shown in the list at the bottom right of the screen. The SNOMED CT codes can be selected and 'bound' to the archetype node. A blue dot in front of a node shows that it has been bound to one or more terms in the currently selected terminology. Holding the cursor over a candidate mapping brings up a tool tip (the blue box) showing a short definition of the term. Free text queries for individual nodes can also be sent to UMLS or to a database containing SNOMED CT tables if locally available.

Figure 2: Definition view. The definition view of the archetype editor.

Results from terminology services can be explored using visualization. On clicking the "Explore" button (Figure 4) an interactive graph opens, as visualized in Figure 5. The graph is constructed by climbing the hierarchy using the IS-A relations, starting from the search results and ending at the top concept.
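The graph construction just described, climbing IS-A relations from the search results up to the top concept, can be sketched as a simple upward traversal. The Python code below is an illustration under assumptions rather than the tool's implementation: the function name, the child-to-parents dictionary and the toy concept names are hypothetical and do not reproduce SNOMED CT content.

```python
from typing import Dict, Iterable, Set, Tuple


def build_is_a_graph(search_results: Iterable[str],
                     parents: Dict[str, Set[str]]
                     ) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Collect all nodes and (child, parent) edges reachable upwards via IS-A."""
    nodes: Set[str] = set()
    edges: Set[Tuple[str, str]] = set()
    stack = list(search_results)
    while stack:
        concept = stack.pop()
        if concept in nodes:
            continue  # already expanded, e.g. reached through another parent
        nodes.add(concept)
        for parent in parents.get(concept, ()):
            edges.add((concept, parent))
            stack.append(parent)
    return nodes, edges


# Toy hierarchy with a multi-parent node (hypothetical names).
toy_parents = {
    "systolic bp": {"blood pressure"},
    "blood pressure": {"vital sign", "cardiovascular measure"},
    "vital sign": {"observable entity"},
    "cardiovascular measure": {"observable entity"},
    "observable entity": {"top"},
}
graph_nodes, graph_edges = build_is_a_graph(["systolic bp"], toy_parents)
```

Since concepts may have multiple parents, the traversal records each node once but keeps every edge, so the result is a directed acyclic graph rather than a tree.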
Other types of relations can also be explored by selecting any node. In addition to exploration, archetype bindings can be created from the graph view as well. The archetype editor download and more information can be found at http://www.imt.liu.se/mi/ehr/

Discussion

Archetype-based systems have so far been implemented and deployed only in limited numbers (http://www.openehr.org/shared-resources/usage/commercial.html). We believe that semantic interoperability through the archetype approach will have greater chances of success if extensive bindings to terminologies are provided. Finding the right terms to bind is a difficult task but the effort to achieve terminology bindings may be reduced with the help of our methods and tools. The integrated editor eliminates the need for users to swap applications to find appropriate terminology entries. The mapping process is further assisted by the ability to get candidate mappings from MoST. Visually relating results from the terminology services (instead of only browsing a list) may assist the user in making the correct binding even if a large number of terms is returned.

Future work

The term binding problem between two independent models (here the openEHR Reference model and SNOMED CT) and the logical control of post-coordination offer challenging tasks [11]. Post-coordination, i.e. the possibility to combine SNOMED CT concepts from different hierarchies, increases the logical complexity of the problem, e.g. combinations like an observable entity

Figure 3: MoST. The system methodology of MoST. Diagram labels: Clinical Data Model (e.g. archetypes or HL7 models); Generalised Hierarchy; MoST XML; Model transformation.
Similar resources
Integration of tools for binding archetypes to SNOMED CT
BACKGROUND The Archetype formalism and the associated Archetype Definition Language have been proposed as an ISO standard for specifying models of components of electronic healthcare records as a means of achieving interoperability between clinical systems. This paper presents an archetype editor with support for manual or semi-automatic creation of bindings between archetypes and terminology s...
Medical Terminology Browsers - How Usable Are Them for Describing Clinical Archetypes?
Clinical terminologies are a major concern in medical informatics, as they are key to provide medical systems with higher levels of interoperability. Large terminologies as SNOMED CT are gaining presence in practical applications. In a related but different direction, archetypes or data type templates are becoming widespread as interchange mechanisms for medical information. Archetypes support ...
Interoperability of Data Models and Terminology Models: Issues with using the SNOMED CT terminology
Work in the field of recording standard, coded data in electronic health records and messages is important to support interoperability of clinical systems. It is also important for reducing medical errors caused by misinterpretation and misrepresentation of data. Standardisation of structured and unstructured data to one or more terminologies such as SNOMED-CT, or ICD requires the help of vario...
SNOMED CT module-driven clinical archetype management
OBJECTIVE To explore semantic search to improve management and user navigation in clinical archetype repositories. METHODS In order to support semantic searches across archetypes, an automated method based on SNOMED CT modularization is implemented to transform clinical archetypes into SNOMED CT extracts. Concurrently, query terms are converted into SNOMED CT concepts using the search engine ...
Combining lexical and structure-based methods to align clinical archetypes to SNOMED CT
Semantic interoperability of health systems will be only possible if clinical data models, such as OpenEHR Archetypes, are agreed by experts and aligned to standard terminology systems. In this paper we present an automated approach combining mapping algorithms to align clinical archetype terms to SNOMED CT concepts.
Publication date: 2009